-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[ES|QL] prune columns when using fork #137907
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[ES|QL] prune columns when using fork #137907
Conversation
|
Hi @pmpailis, I've created a changelog YAML for you. |
708291d to
52893e6
Compare
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
|
run elasticsearch-ci/part-1 |
carlosdelest
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Very good testing! 🔝
I have two questions regarding the pruning branches and the need to have a specific AttributeSet for ignoring IDs.
| * This is useful for Fork plans, where branches may have different Attribute IDs but share a common output schema, | ||
| * allowing equality checks of used attributes based on their names. | ||
| */ | ||
| public static class ForkBuilder extends Builder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case you want to avoid IDs, I think you can reuse Attribute.IdIgnoringWrapper.
See here for an example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking into that, but I think that it introduced too much clutter, as it needed to be bidirectional (and preferred to keep PruneColumns as untouched as possible).
Will take another look though as I could very well be missing something, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason for a new builder, was to take advantage of pruneUnusedAndAddReferences so as to reuse as much code as possible through the existing check in
if (used.contains(attr)) ...
I'm looking into whether we can change the attributes initially passed to the builder and avoid this new instance.
...in/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PruneColumnsInForkBranches.java
Outdated
Show resolved
Hide resolved
...gin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java
Show resolved
Hide resolved
…ng functionality in PruneColumns
carlosdelest
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM - though I suggest that @ioanatia takes a look at it before merging as she's the FORK expert 👍
In this PR we add support for column pruning for
FORKbranches.The main idea is that we apply pruning in two steps:
FORK's output based on the actual needed attributes inPruneColumnsrule.FORKplan independently throughpruneSubPlan, with the same used params as base (i.e.FORK's output). We then proceed to compute separately any additional needed params for each branch and keep/remove plans as per usual inPruneColumns.Closes #136365